REMARKS



This Preliminary Amendment is submitted to improve the form of the English translation 
as filed. It is respectfully requested that this Preliminary Amendment be entered in the above- 
referenced application. 

In accordance with the foregoing, claims 1-11 have been canceled and claims 12-26 
have been added. Thus, claims 12-26 are pending and are under consideration. 

A substitute specification is also being filed herewith. The substitute specification is 
accompanied by a marked-up copy of the original specification. 

If there are any questions regarding these matters, such questions can be addressed by 
telephone to the undersigned. Otherwise, an early action on the merits is respectfully solicited. 

If there are any additional fees associated with filing of this Preliminary Amendment, 
please charge the same to our Deposit Account No. 19-3935. 



Respectfully submitted, 



STAAS & HALSEY LLP 



Date: 





Richard A. Gollhofer 
Registration No. 31,106 



1201 New York Ave, N.W., Suite 700 
Washington, D.C. 20005 
Telephone: (202)434-1500 
Facsimile: (202)434-1501 
MARKED-UP COPY OF SUBSTITUTE SPECIFICATION

TITLE OF THE INVENTION

VOICE ACKNOWLEDGEMENT IN THE CASE OF SPEAKER-INDEPENDENT NAME DIALING

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is based on and hereby claims priority to German Application No.
10311698.2 filed on March 17, 2003, the contents of which are hereby incorporated by
reference.

BACKGROUND OF THE INVENTION 

[0002] The technology of speech recognition for mobile terminals is now so far advanced that 
it is possible to implement dialing by name independent of the speaker (Speaker Independent 
Name Dialing). In this respect, entries in the address book can be dialed directly by speaking 
the entered name, without training of the voice pattern having to be carried out with the user in 
advance. 

[0003] The handsfree mode is restricted in such a form of speech recognition, however, since 
the user is reliant on the acknowledgment on the display for verification of the recognition result 
and receives no acoustic acknowledgment of the recognized entry. 

[0004] To implement an acoustic acknowledgment for speaker-independent name dialing, it is 
currently assumed that text-to-speech (TTS) components have to be used. These TTS 
components generate a synthetic voice output from a text. The recognized name entry in an 
address book can be output in synthesized form by this means. However, the TTS components
which have to be used need a level of computing performance which is high for mobile terminals 
and embedded hardware and also have a large memory requirement, and can therefore only be 
implemented in a very cost-intensive manner. Furthermore, the voice quality of such TTS 
systems for mobile devices is of a low level due to the small footprint. Moreover, foreign names 
are often pronounced in unfamiliar and incorrect ways by TTS systems. 
SUMMARY OF THE INVENTION

[0005] An object underlying the invention is that of implementing a voice
acknowledgment for a recognized voice input using the least possible resources. This object is
achieved by the inventions specified in the independent claims. Advantageous embodiments are
set down in the subclaims.

[0006] Accordingly, in a method for speech recognition, especially on embedded hardware 
and/or a mobile terminal, a first voice signal is input by a user by speaking it in. The designation 
"first" voice signal merely serves the purpose of differentiating the voice signal of this text from 
further, subsequent voice signals. The inputted first voice signal is recognized, by assigning it to 
a recognition entry, and recorded, by storing data in memory for the acoustic restoration of the 
voice signal which is needed for the acoustic representation of the voice signal. Finally, the 
recording of the inputted first voice signal is stored in memory as being assigned to the 
recognition entry. This means that it is available for later recognitions as a confirmation signal in 
the form of a voice acknowledgment. 

[0007] The recording of the inputted first voice signal is preferably only stored in memory as 
being assigned to the recognition entry if it is confirmed by the user that the inputted first voice 
signal has been recognized correctly. Alternatively, or additionally, the storage in memory of a 
voice signal which has been erroneously assigned to a recognition entry can also be deleted 
again later. 

[0008] In particular, prior to the confirmation that the inputted voice signal has been
recognized correctly, a visual representation of the recognition entry can be output on a display.
This means that the user can read the visual representation of the recognition entry and then
confirm that the voice signal has been recognized correctly.

[0009] Following the storage in memory and recognition of the original voice signal, speech 
recognition operations for further voice signals which are identical or similar to the first voice 
signal are structured as follows: a further voice signal is input by the user. The further inputted 
voice signal is recognized by assigning it to the recognition entry. Finally, the recording of the 
inputted first voice signal stored in memory as being assigned to the recognition entry is output 
acoustically for the purposes of confirming that the further inputted voice signal has been 
recognized as the recognition entry. 
[0010] In addition to the automatic assignment and storage in memory of voice signals
described above, the user can be given the opportunity to record voice signals and assign them
manually to recognition entries. To this end, a desired voice signal can
be input and stored in memory in association with a further recognition entry without
intervening speech recognition.

[0011] The method especially constitutes a method for speaker-independent name dialing. 
However, it can also be applied to all other application areas of speech recognition, especially 
speaker-independent speech recognition, where a voice acknowledgment is needed for the 
purposes of implementing a "Full Handsfree" mode, such as in Command & Control; in Voice 
Links, especially in Internet navigation; in voice-based selection of applications (Speech 
Application Selection) and/or in voice-based input of city and street names (City Name Input), 
for example. 

[0012] A device which is set up and has resources to execute the outlined method can
be implemented by means of appropriate programming and setting up of a
data processing system, for example. In this respect, the device in particular has resources
for inputting the voice signal, resources for recognizing the voice signal by means of assignment
to a recognition entry, and memory resources in which the inputted voice signal is capable of
being stored in association with the recognition entry. Advantageous embodiments of the device
result in a similar manner to the advantageous embodiments of the method.

[0013] The device especially constitutes a mobile terminal, and preferably a mobile 
communication facility, possibly in the form of a mobile telephone and/or PDA or a mobile 
navigation facility in the form of a navigation system in a vehicle. 

[0014] A program product for a data processing system which contains blocks of code with
which one of the outlined methods can be executed on the data processing system can be
executed by means of suitable implementation of the method in a programming language and
translation into code which can be executed by the data processing system. The blocks of code
are stored in memory to this effect. In this respect, 'a program product' indicates that the
program is a commercial product. It may exist in any desired form: thus, for example, on
paper, a computer-readable data medium or distributed across a network.

[0015] Further advantages and features of the invention arise from the description of an 
exemplary embodiment. 
[0016] The invention makes it possible to implement a voice acknowledgment inexpensively
in a step-by-step process without the use of TTS components in the case of speaker- 
independent name dialing. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

[0017] According to the invention, a name spoken by a user is, in the case of a
voice dialing operation, not only fed to the speech recognition unit, but is additionally also 
sampled as a stored speech segment in parallel. In the case of the first name dialing operation 
for an address book entry, the name entry recognized by the speech recognition unit is 
displayed to the user visually on the screen. Furthermore, the user is requested acoustically, 
with the aid of a tone, to confirm the recognition result. If the user confirms the result, the 
recognized address book entry is dialed and the recording of the inputted voice signal, in the 
form of the recorded stored speech segment, is assigned to the recognition entry, in the form of 
the address book entry. In the case of every further name dialing operation for that entry, the 
assigned stored speech segment can then also be used as a voice acknowledgment alongside 
the visual acknowledgment. This means that the user is informed of the recognition result both 
visually and also acoustically. This allows a Full Handsfree mode to be achieved which 
possesses correct, high-quality voice reproduction. The reliably assigned stored speech 
segment of the user makes it possible in this respect to dispense with the cost-intensive TTS 
component. 

[0018] The invention is therefore founded on a self-initiating system which is based on the 
combination of the voice sampling in the course of speech recognition and the reliable 
assignment of a voice sample by means of confirmation of the recognition result.

[0019] This should be explained again with reference to a more concrete exemplary 
embodiment. In a mobile phone, functions of speaker-independent name dialing are 
implemented by using a speaker-independent, HMM-based speech recognition unit. All the 
names in the user's address book are made known to the speech recognition unit by way of a 
grapheme-to-phoneme technology and can therefore be dialed directly by voice.

[0020] In the initial state of the system, there are no stored speech segments in association 
with the address book entries. Upon activation of the functionality for speaker-independent 
name dialing, the name spoken by the user is fed to the speech recognition unit and sampled as 
a stored speech segment in parallel. The speech recognition unit returns the recognition result 
and a check is carried out as to whether a stored speech segment is already present in
association with the recognition result. 

[0021] If there is no stored speech segment as yet, the recognition result is displayed on the 
screen and the user is requested, with the aid of a voice prompt such as "Confirm recognition" 
or "Dial", for example, to confirm the recognition result. If the result is confirmed by use
of the "Dial" key, the stored speech segment is assigned to the address book entry and the
number is dialed. If the result is rejected by use of the "Cancel" key, the stored
speech segment is deleted and a dialing operation is not carried out.

[0022] If a stored speech segment is already assigned in association with a recognized
address book entry, this segment is played to the user in addition to the screen display. The dialing
operation is then started up automatically. The voice acknowledgment (Voice Feedback) 
provides the user, even in handsfree operation, with the opportunity to check simply whether the 
recognition result is correct. During the ongoing dialing operation, the user is normally left with 
enough time to still cancel the dialing operation in the event of an incorrect recognition. 
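
The decision flow of paragraphs [0020] to [0022] can be sketched in code as follows. This is only an illustrative sketch under stated assumptions and not the implementation of the embodiment: the class name NameDialer, the callbacks recognize and confirm, and the lists played and dialed are hypothetical stand-ins for the speech recognition unit, the "Dial"/"Cancel" key press, the acoustic output and the dialing operation, none of which are specified at this level in the description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class NameDialer:
    """Hypothetical sketch of the flow in paragraphs [0020] to [0022]."""

    # Assumed stand-in for the speaker-independent speech recognition unit.
    recognize: Callable[[bytes], str]
    # Assumed stand-in for the user pressing "Dial" (True) or "Cancel" (False).
    confirm: Callable[[str], bool]
    # Address book entry -> stored speech segment (empty in the initial state).
    segments: Dict[str, bytes] = field(default_factory=dict)
    # Logs standing in for the acoustic output and the dialing operation.
    played: List[bytes] = field(default_factory=list)
    dialed: List[str] = field(default_factory=list)

    def handle_utterance(self, audio: bytes) -> Optional[str]:
        # The spoken name is fed to the recognizer; the same audio is the
        # sampled speech segment kept in parallel.
        entry = self.recognize(audio)
        stored = self.segments.get(entry)
        if stored is not None:
            # A segment is already assigned: play it back as the voice
            # acknowledgment and start dialing automatically.
            self.played.append(stored)
            self.dialed.append(entry)
            return entry
        # First dialing operation for this entry: the result is shown to the
        # user, who must confirm it before the segment is assigned.
        if self.confirm(entry):
            self.segments[entry] = audio
            self.dialed.append(entry)
            return entry
        # Not confirmed: the sampled segment is discarded and nothing is dialed.
        return None
```

In this sketch, the first confirmed call for an entry stores the user's own utterance, and every later call for the same entry replays that stored utterance, so no TTS component is involved.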

[0023] In addition to the automatic assignment of stored speech segments described above,
the user can be offered the opportunity to record stored speech segments and assign them
manually.

[0024] If a plurality of users use a device, user profiles can be created where a user's own 
speech segments are stored in the respective profile for each user individually. This allows a 
mixture of voices to be avoided and a homogeneous acoustic sound pattern to be achieved. 
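
One possible way to keep the speech segments of several users apart, as paragraph [0024] describes, is a store keyed first by user profile and then by address book entry. The sketch below is an assumption for illustration only; the class ProfileSegmentStore and its methods assign and lookup are hypothetical names, and the description does not prescribe any particular data structure.

```python
from collections import defaultdict
from typing import Dict, Optional


class ProfileSegmentStore:
    """Hypothetical per-profile store of speech segments (paragraph [0024])."""

    def __init__(self) -> None:
        # profile name -> (address book entry -> stored speech segment)
        self._by_user: Dict[str, Dict[str, bytes]] = defaultdict(dict)

    def assign(self, user: str, entry: str, segment: bytes) -> None:
        # Store the segment only in the profile of the user who spoke it.
        self._by_user[user][entry] = segment

    def lookup(self, user: str, entry: str) -> Optional[bytes]:
        # Only the active user's own segments are ever returned, so the
        # acknowledgment is never played back in another user's voice.
        return self._by_user.get(user, {}).get(entry)
```

With such a store, an entry confirmed by one user has no acknowledgment segment for a second user until that second user has confirmed the entry as well.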

[0025] The invention has been described in detail with particular reference to preferred 
embodiments thereof and examples, but it will be understood that variations and modifications 
can be effected within the spirit and scope of the invention covered by the claims which may 
include the phrase "at least one of A, B and C" as an alternative expression that means one or
more of A, B and C may be used, contrary to the holding in Superguide v. DIRECTV, 
69 USPQ2d 1865 (Fed. Cir. 2004). 